Abstract

We investigate whether quantum history theories can be consistent 
with Bayesian reasoning and whether such an analysis helps clarify the 
interpretation of such theories. First, we summarise and extend recent 
work categorising two different approaches to formalising multi-time mea- 
surements in quantum theory. The standard approach consists of describ- 
ing an ordered series of measurements in terms of history propositions 
with non-additive 'probabilities'. The non-standard approach consists of
defining multi-time measurements to consist of sets of exclusive and ex- 
haustive history propositions and recovering the single-time exclusivity 
of results when discussing single-time history propositions. We analyse 
whether such history propositions can be consistent with Bayes' rule. We 
show that a certain class of histories is given a natural Bayesian interpretation,
namely the linearly positive histories originally introduced by
Goldstein and Page. Thus we argue that this gives a certain amount of 
interpretational clarity to the non-standard approach. We also attempt a 
justification of our analysis using Cox's axioms of probability theory. 

Keywords: Bayesian Probability, Consistent Histories, Linear Positivity 
PACS: 02.50.Cw, 02.50.Tt, 03.67.-a, 03.65.Ca

Outline 

The basic premise of this paper is rather simple. We propose to apply Bayesian 
probability rules to quantum histories theory and see if we get any form of 
consistency. There are a few reasons why this is a pedagogically useful tack to
take. Firstly, Bayesian probability is pedagogically useful in its own right as 
it provides a framework for thinking about probabilities that is rather natu- 
ral in a human sense — it accommodates, in different situations, all uses of the 
term 'probability' including probabilistic inference and relative frequencies [1].
Secondly, quantum history theories are specifically designed with the idea of 
applying such probabilities to closed systems, without necessarily discussing ob- 
servers and their experiments; thus it is natural to interpret such probabilities 
in a Bayesian manner rather than necessarily discussing the relative frequencies 
of experiments. In fact Bayesian probability can accommodate almost all notions
of relative frequency presently used in the literature [1], whereas theories
of relative frequency have to be designed for the problem at hand. Thirdly,
even when discussing quantum histories instrumentally such a Bayesian inter- 
pretation might help to clarify the interpretation of Standard Quantum Theory 
(SQT) [2]. It will turn out that we can apply a very natural Bayesian interpre- 
tation to a certain class of quantum histories. So, although it may seem naive 
at first we will get out something quite profound, a natural interpretation of the 
probabilities of certain history propositions. In fact, as we will show, we can 
justify our analysis using Cox's axioms of probability theory [3] to show that 
the standard notions of probability in the consistent histories programme aren't 
necessarily 'good' notions of probability, but there are alternative notions. 

Before we get stuck into our Bayesian analysis, let us briefly discuss why the 
foundational interpretation of probability really matters when interpreting such 
theories. Then we will introduce quantum history theories and try and analyse 
what consistency we can get through Bayesian reasoning. 

Quantum Probabilities 

In Standard Quantum Theory (SQT) one usually invokes multi-time measure- 
ments as a succession of single-time measurements. If we use the von Neumann 
measurement formalism then the possible exclusive propositions at each time 
are represented by the projection operators associated with the eigenstates of 
a Hermitian operator. Thus a non-relativistic history is represented as a succession
of such projection operators each labelled by a time. The probability of
each such history is, in the standard formalism, given by the probability trace
formula. For example, for a three-time succession ($t_1 < t_2 < t_3$) of von Neumann
measurements $\{A(t_1), B(t_2), C(t_3)\}$ given an initial state $\rho$, the probability of a
history $\{a_i(t_1), b_j(t_2), c_k(t_3)\}$ is given by:

$p(a_i, b_j, c_k | \rho) = \mathrm{tr}(\hat{c}_k(t_3)\,\hat{b}_j(t_2)\,\hat{a}_i(t_1)\,\rho\,\hat{a}_i(t_1)\,\hat{b}_j(t_2))$. (1)

In the above equation the results of each measurement are conveniently 
represented by Heisenberg picture projection operators. For example, the results 
of the von Neumann measurement $A$ at time $t_1$ are represented by the set of
Heisenberg picture projection operators $\hat{a}_i(t_1) = U^\dagger(t_1 - t_0)\,\hat{a}_i\,U(t_1 - t_0)$ where
$\hat{a}_i$ are the relevant Schrödinger picture operators and $t_0$ is the fiducial time.
Similarly for the other von Neumann measurements B and C. This is the 
standard way that multi-time measurements are invoked in non-relativistic SQT. 
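
As a minimal numerical sketch of Eq. (1), assume a single qubit with trivial dynamics (so that Heisenberg and Schrödinger picture projectors coincide), the initial state $\rho = |0\rangle\langle 0|$, and a two-time history consisting of an X measurement followed by a Z measurement. These choices, and all function and variable names, are illustrative assumptions rather than anything taken from the paper.

```python
# Plain-Python 2x2 matrix helpers (illustrative; no external libraries).
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def history_probability(projectors, rho):
    """tr(P_n ... P_1 rho P_1 ... P_n), with projectors listed earliest first.

    By cyclicity of the trace and P_n P_n = P_n this equals the form of Eq. (1).
    """
    m = rho
    for p in projectors:
        m = matmul(matmul(p, m), p)  # sandwich the running operator: P m P
    return trace(m)

# Assumed example: initial state |0><0| and projectors onto X and Z eigenstates.
RHO = [[1.0, 0.0], [0.0, 0.0]]
X_PLUS = [[0.5, 0.5], [0.5, 0.5]]
X_MINUS = [[0.5, -0.5], [-0.5, 0.5]]
Z_UP = [[1.0, 0.0], [0.0, 0.0]]
Z_DOWN = [[0.0, 0.0], [0.0, 1.0]]

probabilities = {
    (x_name, z_name): history_probability([x, z], RHO)
    for x_name, x in (("x+", X_PLUS), ("x-", X_MINUS))
    for z_name, z in (("z_up", Z_UP), ("z_down", Z_DOWN))
}
for history, p in probabilities.items():
    print(history, p)
```

In this assumed example each of the four two-time histories receives probability 0.25, and the four values sum to one.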

Classically, the additivity of propositions in Bayesian probability theory is 
a contextual property of propositions. This can be seen in the pedagogical 
example given recently by Mana [2]. Take an urn that contains some red balls
and some wooden balls; the urn is shaken and an observer takes out a ball.
We can ask for the probabilities of the following two propositions: "the ball
is red" and "the ball is wooden". Only if no ball can be both wooden and red
are these two propositions exclusive. Thus, we
can see that propositions are not inherently exclusive. Rather, classically at 
least, exclusivity is a contextual property of propositions as there are ways that 
these two propositions could be non-exclusive (say, some balls are both red 
and wooden). Therefore, since there is no mention of contexts in the standard 
analysis, one is not necessarily discussing exclusive propositions when discussing 
the possible history propositions that arise through an ordered succession of von 
Neumann measurements. Or, one might, ambiguously, be implicitly invoking
many possible contexts that need to be formally differentiated. 

The only way, classically, we have to define whether two propositions A and 
B are exclusive is to equate exclusivity with the additivity of their probabilities: 

$p(A \cup B | I) = p(A|I) + p(B|I)$ (2)

such that $A \cap B = 0$, where '0' is the null proposition that is always false in
standard Boolean logic; when $A \cap B = 0$ we say that $A$ and $B$ are disjoint.
So, if two propositions are both additive and disjoint we will simply call them
'exclusive' with respect to context $I$. If this is not satisfied then propositions $A$
and $B$ are called 'not-exclusive' with respect to context $I$. Classical probability
theory and SQT therefore differ by how they treat 'not-exclusive' propositions. 
Note that we use the term 'exclusive', throughout this paper, in a pedagogically 
distinct way to how it is normally used in the quantum histories literature — 
where exclusive is synonymous with disjoint. We wish to differentiate 'disjoint' 
and 'exclusive' propositions because, classically, exclusive propositions must al- 
ways be additive so we wish to reserve the word 'exclusive' only for contextually 
additive propositions. This means that when we generalise we keep the stan- 
dard notion of exclusivity and are forced to name any other tentative notion 
something else so as to avoid confusion. In standard Bayesian probability the- 
ory exclusive and disjoint are considered equivalent notions (the former being 
about probabilities and the latter about the propositions themselves) but when 
we get into problems with non-additivity we should differentiate these notions. 
Obviously, single-time von Neumann measurements consist of sets of 'exclusive' 
propositions (both additive and disjoint). When we wish to differentiate our 
introduced notion of conventional probabilistic exclusivity from other presumed 
notions of exclusivity we will do so explicitly, otherwise we will simply use the 
term exclusive in the standard contextual probabilistic way we have introduced 
above. When we come to discuss quantum history theories it will turn out 
that exclusive propositions are also disjoint, but disjoint propositions aren't 
necessarily exclusive (since exclusivity is taken to be a contextual probabilistic 
property of propositions whereas disjointness is something that is defined on the 
proposition algebra). 
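
The contextual character of classical exclusivity can be illustrated with the urn example. The sketch below is an assumption-laden toy: the two urn compositions and all names are hypothetical, and a ball is assumed to be drawn uniformly at random. It checks, for each context, whether "the ball is red" and "the ball is wooden" are both disjoint (never jointly true) and additive in the sense of Eq. (2).

```python
from fractions import Fraction

def proposition_probabilities(balls):
    """balls: list of (is_red, is_wooden) pairs; one ball is drawn uniformly."""
    n = len(balls)
    p_red = Fraction(sum(1 for red, wooden in balls if red), n)
    p_wooden = Fraction(sum(1 for red, wooden in balls if wooden), n)
    p_either = Fraction(sum(1 for red, wooden in balls if red or wooden), n)
    p_both = Fraction(sum(1 for red, wooden in balls if red and wooden), n)
    return p_red, p_wooden, p_either, p_both

def are_exclusive(balls):
    """Exclusive in this context: never jointly true, and probabilities add."""
    p_red, p_wooden, p_either, p_both = proposition_probabilities(balls)
    return p_both == 0 and p_either == p_red + p_wooden

# Hypothetical context A: no ball is both red and wooden.
CONTEXT_A = [(True, False)] * 3 + [(False, True)] * 2
# Hypothetical context B: one ball is both red and wooden.
CONTEXT_B = [(True, False)] * 2 + [(True, True)] + [(False, True)] * 2

print(are_exclusive(CONTEXT_A))  # True
print(are_exclusive(CONTEXT_B))  # False
```

The same pair of propositions is exclusive in one context and not in the other, which is the sense in which exclusivity is treated here as a property relative to $I$.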

Here we use standard Bayesian notation such that all probabilities are defined
with respect to a specific context $I$ for exactly the reason noted above:
propositions $A$ and $B$ are only well-defined in a given context exactly because
their meaning is contextual$^1$. By invoking contexts explicitly we hope to clarify
the meaning of such statements. Since the probabilities given by (1) are not
necessarily additive we are, in the standard interpretation, having to invoke
a different kind of exclusivity to that invoked when requiring both (2) and
disjointness. The standard von Neumann interpretation of exclusivity comes about
because each single-time measurement consists of explicitly exclusive proposi- 
tions and it is rather natural (although we argue that it is perhaps dubious) 
to presume that successions of these single-time measurements give well-defined 
exclusive history propositions — even though the non-additivity of the probabil- 
ities of such history propositions suggests that they are not probabilistically 
'exclusive' as we have defined above. 

1 We use the term 'context' in the colloquial manner used by Bayesian theorists rather than 
in the technical sense of the Kochen-Specker theorem. 
So, as Anastopoulos has argued [4], there is a dichotomy for multi-time
measurements that we must account for. We have a choice between the two 
following paradigms: 

1. Postulate single-time exclusivity of results in the standard manner and 
presume some naive kind of exclusivity for multi-time propositions that 
arise from a series of single-time measurements. This is the standard 
interpretation of von Neumann measurements. 

2. Postulate the exclusivity of some history propositions, using the standard 
notion of exclusivity of probability theory as we have defined above, and 
get single-time exclusivity of results as a corollary by discussing single-time 
history propositions. 

In what follows we shall refer to these two interpretations as 1 and 2 respectively.
Anastopoulos argues, in [4], that neither interpretation of multi-time
measurements has yet been convincingly promoted. We are, of course, used
to interpretation 1 and not used to interpretation 2. If we use interpretation 1
then, as Anastopoulos [4] shows, we are forced to admit a dependency of the
probabilities (treated as relative frequencies) on the resolution of the apparatus
we use, exactly because such 'probabilities' are non-additive and thus aren't exclusive
in the conventional sense (nor are they not-exclusive in the conventional
sense). So, if we use finer-grained projection operators we get different probabilities
out for given sample sets. It is interesting to investigate interpretation
2 simply because it is not usually considered and cannot be rejected a priori.

It is the conflict implicit in the noted dichotomy which makes us so uncomfortable
with multi-time measurements. In interpretation 1 any ordered
set of measurements, presuming that the relevant apparatus can be made, is 
well-defined — this suggests an amazing amount of freedom that nature gives to 
experimental physicists. 

A Pedagogical Account of Consistent Histories 

There does exist a quantum formalism that implicitly uses interpretation 2,
namely the Consistent Histories (CH) programme [5-9]. Rarely, however,
is interpretation 2 explicitly used by consistent historians. Rather, CH is usually
invoked in a non-instrumental fashion (some exceptions to this trend are [10]
and [11]). Interpretation 2 is also in opposition to the general claim by some
consistent historians that CH solves the measurement problem. Interpretation
2 is a way to re-define measurement rather than solve the measurement problem
per se.

Let us give a brief introduction to the CH programme; the basic setup of 
CH is as follows. One defines a set of homogeneous history propositions; following
the literature, each homogeneous history proposition $\alpha$ consists of an ordered tensor
product of time-labelled projection operators just like in SQT, for example:

$\alpha := \hat{\alpha}_{t_n}(t_n) \otimes \hat{\alpha}_{t_{n-1}}(t_{n-1}) \otimes \cdots \otimes \hat{\alpha}_{t_2}(t_2) \otimes \hat{\alpha}_{t_1}(t_1)$ (3)

where each $\hat{\alpha}_{t_k}(t_k)$ is a standard single-time projection operator. Here we use the
Heisenberg picture. The ordered set of times over which an homogeneous history 
is defined is called its temporal support. We can then naturally define the class
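operator |S] for such a history to be:

$C_\alpha := \hat{\alpha}_{t_n}(t_n)\,\hat{\alpha}_{t_{n-1}}(t_{n-1}) \cdots \hat{\alpha}_{t_2}(t_2)\,\hat{\alpha}_{t_1}(t_1)$ (4)

and the probability formula (1) becomes:

$p(\alpha | I) = \mathrm{tr}(C_\alpha\,\rho\,C_\alpha^\dagger)$. (5)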

It is natural to extend the definition of history propositions to include in- 
homogeneous history propositions [HJ- Inhomogeneous history propositions are 
defined by combining homogeneous history propositions in novel, but rather 
natural, ways. We will not repeat such arguments here (see Isham's original
work) because it is sufficient simply to note the following. One can define
'or' and 'not' operations for homogeneous history propositions in a rather
natural manner; such operations are denoted '$\vee$' and '$\neg$' respectively. These operations
are not the standard notions of 'or' and 'not' in Boolean logic, but are
defined naturally on the history algebra. The standard 'and' operation '$\wedge$' takes
homogeneous history propositions into homogeneous history propositions and
behaves exactly like the Boolean 'and' operation should. We can also naturally
define a notion of disjointness; we denote such a relation '$\perp$'. Note that we have
explicitly been calling these histories 'propositions'; this is because, in analogy 
with Bayesian probability theory, we are going to treat them as propositions in 
the standard sense to see if we get any consistency via Bayesian reasoning. 

When two homogeneous history propositions $\alpha$ and $\beta$ are disjoint (such that
they have the same temporal support) then the class operator for the history
$\alpha \vee \beta$ is simply:

$C_{\alpha \vee \beta} = C_\alpha + C_\beta$. (6)

We define two history propositions to be 'exclusive' if their probabilities are
additive under this '$\vee$' operation and such that, in the same context, the probability
of both being the case is zero. A sufficient condition for two disjoint
history propositions to have additive probabilities, and thus be exclusive propositions,
is defined using what is called the decoherence functional $d$. For SQT
the decoherence functional acting on two homogeneous history propositions $\alpha$
and $\beta$ is defined as follows:

$d_{\rho,H}(\alpha, \beta) := \mathrm{tr}(C_\alpha\,\rho\,C_\beta^\dagger)$. (7)

There is an ambiguity in how we have defined homogeneous history propositions
because we have used the Heisenberg picture in their definition, but
obviously one could use Schrödinger picture projection operators and absorb
all the dynamics into the definition of the decoherence functional. So, the subscripts
$\rho$ and $H$ refer to such a dependence of the decoherence functional on
the initial state and the Hamiltonian. We will drop these subscripts from now
on and such dependence is kept implicit. One can consider the initial state and
dynamics constant throughout the following discussion. Obviously, $d(\alpha, \alpha)$ has
the same form as (5). If we take two homogeneous history propositions $\alpha^i$ and
$\alpha^j$ then their respective probabilities ($d(\alpha^i, \alpha^i)$ and $d(\alpha^j, \alpha^j)$ respectively) are
additive if $d(\alpha^i, \alpha^j) = 0$; if this is the case then we will call these two history
propositions 'exclusive' with respect to a context $I$. We call the context '$I$'
simply to give it a name and invoke a context explicitly; we reserve the right
to change its name, or use a different context, later. $I$ obviously must specify $\rho$
and $H$, but it may also specify further information at present left unspecified.
A set of such history propositions $\{\alpha^i : i = 1, 2, \ldots, N\}$ is called 'd-consistent' [H]
when all such propositions are mutually 'exclusive' and exhaustive with respect
to context $I$. Single-time SQT is recovered by noting that von Neumann single-time
measurements are d-consistent sets of single-time history propositions. For
two disjoint histories we have that $d(\alpha^i \wedge \alpha^j, \alpha^i \wedge \alpha^j) = 0$. A set of disjoint histories
that form a partition of unity such that $\sum_i d(\alpha^i, \alpha^i) = 1$ is simply called
a 'complete' set.
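
The role of the decoherence functional can be sketched numerically. The example below assumes a single qubit with trivial dynamics and initial state $|0\rangle\langle 0|$; the two families of histories and all names are illustrative assumptions. It builds class operators as in Eq. (4), evaluates Eq. (7), and compares the probability of the coarse-grained history from Eq. (6) with the sum of the fine-grained probabilities.

```python
# Plain-Python 2x2 matrix helpers (illustrative; no external libraries).
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def class_operator(projectors):
    """Eq. (4): C = P_n ... P_1 for projectors listed earliest first."""
    c = IDENTITY
    for p in projectors:
        c = matmul(p, c)
    return c

def decoherence(c_a, c_b, rho):
    """Eq. (7): d(alpha, beta) = tr(C_alpha rho C_beta^dagger)."""
    return trace(matmul(matmul(c_a, rho), dagger(c_b)))

RHO = [[1.0, 0.0], [0.0, 0.0]]
X_PLUS = [[0.5, 0.5], [0.5, 0.5]]
X_MINUS = [[0.5, -0.5], [-0.5, 0.5]]
Z_UP = [[1.0, 0.0], [0.0, 0.0]]
Z_DOWN = [[0.0, 0.0], [0.0, 1.0]]

# Assumed family 1: X at the first time, Z at the second time.
c1 = class_operator([X_PLUS, Z_UP])
c2 = class_operator([X_MINUS, Z_UP])
d11 = decoherence(c1, c1, RHO)
d22 = decoherence(c2, c2, RHO)
d12 = decoherence(c1, c2, RHO)
d_or = decoherence(add(c1, c2), add(c1, c2), RHO)  # class operator from Eq. (6)
print(d11, d22, d12, d_or)

# Assumed family 2: Z at both times.
c3 = class_operator([Z_UP, Z_UP])
c4 = class_operator([Z_DOWN, Z_UP])
d33 = decoherence(c3, c3, RHO)
d44 = decoherence(c4, c4, RHO)
d34 = decoherence(c3, c4, RHO)
d_or_z = decoherence(add(c3, c4), add(c3, c4), RHO)
print(d33, d44, d34, d_or_z)
```

In the first family the off-diagonal term `d12` is non-zero and `d_or` differs from `d11 + d22`, so the two disjoint histories are not exclusive in the sense used here. In the second family `d34` vanishes and the probabilities add.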

If we use interpretation 2 then it is clear that we could equate multi-time
measurements with d-consistent sets. However, rather than call a d-consistent
set a measurement (which might get quite confusing when discussing the distinction
between interpretations 1 and 2) we will call a d-consistent set a 'null-counterfactual'.
A null-counterfactual consists of an exclusive and exhaustive
set of propositions. A counterfactual statement is a statement about what would 
have happened in a different context; a null-counterfactual statement is simply 
the trivial statement about what would happen if the same context was invoked. 
Obviously in quantum theory a null-counterfactual can have many different ex- 
clusive results because of its probabilistic nature ||. So a null-counterfactual 
is almost like a definition of 'context', but we do not wish to use the term 
'context' because of its more technical use in SQT and because, in what fol- 
lows, we use the term in the more colloquial Bayesian manner. If such a von 
Neumann measurement is repeated using an identical setup then one of the 
possible propositions is, exclusively, the case; so, a standard single von Neu- 
mann measurement is a null-counterfactual. The same is considered true for 
null-counterfactuals consisting of more general history propositions. A series of 
von Neumann measurements does not necessarily define a null-counterfactual. 

Null-counterfactuals can also include inhomogeneous history propositions. 
Some history propositions cannot be defined in any d-consistent set; such history
propositions are to be called non-d-realisable. A necessary and sufficient
condition for a history proposition $\alpha^i$ to be d-realisable is thus:

$d(\alpha^i, \alpha^i) + d(\neg\alpha^i, \neg\alpha^i) = 1$. (8)

Although it is not yet clear which of interpretations 1 or 2 is
physically correct, it is pedagogically interesting to investigate interpretation 2
because it is not usually considered and cannot be rejected a priori [4]. Adopting
interpretation 2 is tempting because of its clear and unambiguous definition of exclusivity
and null-counterfactual statements. In interpretation 1 one might run an experiment
and a certain history proposition is realised; one may then ask a null-counterfactual
question: "what history propositions could be realised if you
repeated the experiment in exactly the same manner?" and you are forced to
presume that any distinct history proposition that is realised upon a second run
is exclusive to the one you first received even though it is not probabilistically
exclusive in the standard sense. Using null-counterfactuals in interpretation 2
one bypasses this problem (as such a definition of exclusivity is uncontroversial).
Just as we define a single-time measurement to be some kind of context in which
an exclusive set of single-time propositions can be realised, so it seems we might
wish to define a multi-time measurement to be related to contexts in which an
exclusive set of history propositions can be realised.

Just as von Neumann measurements can be convexly mixed we might assume 
that more general null-counterfactuals can be mixed. If we use interpretation
1 then it is clear that a succession of von Neumann measurements might not
be defined by a d-consistent set of homogeneous history propositions, but each
homogeneous history proposition might be d-realisable. In such a case then
perhaps, one might naively think, we can define such a multi-time measurement
using interpretation 2 by mixing null-counterfactuals.

Note that in the CH interpretation of quantum systems the values of probabilities
of history propositions are independent of the d-consistent set they are
taken to be part of. This is rather analogous to the Gleason non-contextuality
of single-time SQT [12]. It is exactly this type of non-contextuality that has, in
the history of SQT, confused the distinction between interpretations 1 and 2.
This is because the independence of the values of probabilities upon contexts 
doesn't necessarily mean that we can disregard the contextual element of their 
very definition. A tautology: probabilities with the same values are not nec- 
essarily the same probabilities — they might need to be distinguished. No two 
equals are the same. 

One can easily imagine a situation where an experimenter mixes a set of von 
Neumann measurements such that she chooses each such measurement with 
a given weight. In such a case it doesn't matter that the propositions realised 
within different measurements are not considered exclusive when taken together, 
they become exclusive only by the application of the mixing process. Similarly 
one might be able to give a good definition to a mixture of null-counterfactuals. 

So, if we take a succession of von Neumann measurements and label the
possible history propositions as $\{\alpha^i : i = 1, 2, \ldots, N\}$ then this set need not be
d-consistent but if all the $\alpha^i$ are d-realisable then the sets $\{\alpha^i, \neg\alpha^i\}$ will be
d-consistent for each $i = 1, 2, \ldots, N$. Let us denote $p(\alpha^i|I) = d(\alpha^i, \alpha^i)$ in order
to emphasise the probabilistic interpretation of the decoherence functional. If
we mix these null-counterfactuals we must assign weights $w_i$ to each such d-consistent
set $\{\alpha^i, \neg\alpha^i\}$, just as we would if we were mixing a set of von Neumann
measurements. Presuming that the context $I$ is the same for each element of
the mixture (we reserve the right to change this assumption later but it is easy
to assume that the proposition $\alpha^i \vee \neg\alpha^i$ is equivalent to the proposition $\alpha^j \vee \neg\alpha^j$
since they are normally considered equivalent tautologies, our mixture $M$ being
a weighted set of different tautologies in this case) then the probability for any
history $\alpha^i$ to be received given such a mixed set $M$ is given by the following:

$p(\alpha^i|M) = w_i\,p(\alpha^i|I) + \sum_{j \neq i}^{N} w_j\,p(\alpha^i \wedge \alpha^j|I) + \sum_{j \neq i}^{N} w_j\,p(\alpha^i \wedge \neg\alpha^j|I)$. (9)

If we equate $p(\alpha^i \wedge \alpha^j|I)$ with $d(\alpha^i \wedge \alpha^j, \alpha^i \wedge \alpha^j)$ then we must note that
$d(\alpha^i \wedge \alpha^j, \alpha^i \wedge \alpha^j)$ equals zero for all disjoint homogeneous history propositions
$\alpha^i$ and $\alpha^j$ which are defined over the same temporal support. This is because,
for all such history propositions, $\alpha^i \wedge \alpha^j = 0$, where 0 is used to denote the null
history proposition. Thus for all homogeneous history propositions so defined
the first summation in (9) is equal to zero.

The simplest case is when all the d-consistent sets $\{\alpha^i, \neg\alpha^i\}$ are a priori
equally likely; so let's try this and see what happens in the case where $w_i = \frac{1}{N}$
for all $i$; each d-consistent set will be the corresponding null-counterfactual with
a priori weight $\frac{1}{N}$. Note that we don't yet call such weights 'probabilities'. In
such a case we get:

$p(\alpha^i|M) = \frac{1}{N}\left(p(\alpha^i|I) + \sum_{j \neq i}^{N} p(\alpha^i \wedge \neg\alpha^j|I)\right)$. (10)

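Now we must ask what form $p(\alpha^i \wedge \neg\alpha^j|I)$ should take in terms of the
decoherence functional. It is clear that, by intuition, the history proposition
$\alpha^i \wedge \neg\alpha^j$ is equivalent to $\alpha^i$. The proposition that the history proposition $\alpha^i$
is the case and the proposition that $\alpha^j$ isn't the case is just equivalent to the
proposition that the history proposition $\alpha^i$ is the case. This can be shown
explicitly in the History Projection Operator (HPO) form of CH [SJ. In the
HPO formalism homogeneous history propositions are represented by tensor
products of the relevant single-time propositions. So for two two-time history
propositions we have that $\alpha^i = \hat{\alpha}^i_{t_1} \otimes \hat{\alpha}^i_{t_2}$ and $\alpha^j = \hat{\alpha}^j_{t_1} \otimes \hat{\alpha}^j_{t_2}$. If we assume that
these two history propositions are defined such that $\hat{\alpha}^i_{t_1} \perp \hat{\alpha}^j_{t_1}$ and $\hat{\alpha}^i_{t_2} \perp \hat{\alpha}^j_{t_2}$
then the proof that $\alpha^i \wedge \neg\alpha^j = \alpha^i$ goes as follows:

$\hat{\alpha}^i_{t_1} \otimes \hat{\alpha}^i_{t_2} \wedge \neg(\hat{\alpha}^j_{t_1} \otimes \hat{\alpha}^j_{t_2}) := (\hat{\alpha}^i_{t_1} \wedge \neg\hat{\alpha}^j_{t_1}) \otimes (\hat{\alpha}^i_{t_2} \wedge \neg\hat{\alpha}^j_{t_2})$
$\quad + (\hat{\alpha}^i_{t_1} \wedge \neg\hat{\alpha}^j_{t_1}) \otimes (\hat{\alpha}^i_{t_2} \wedge \hat{\alpha}^j_{t_2}) + (\hat{\alpha}^i_{t_1} \wedge \hat{\alpha}^j_{t_1}) \otimes (\hat{\alpha}^i_{t_2} \wedge \neg\hat{\alpha}^j_{t_2})$ (11)
$= \hat{\alpha}^i_{t_1} \otimes \hat{\alpha}^i_{t_2} + \hat{\alpha}^i_{t_1} \otimes \hat{0} + \hat{0} \otimes \hat{\alpha}^i_{t_2}$ (12)
$= \hat{\alpha}^i_{t_1} \otimes \hat{\alpha}^i_{t_2}$. (13)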

Eq. (11) represents the intuitive logical result that the history proposition
$\alpha^i \wedge \neg\alpha^j$ can be true in three different ways, namely if any one of the three
history propositions on the RHS of Eq. (11) is true. To get from Eq. (12) to
Eq. (13) we simply note that all history propositions which have a null result at
any given time are deemed equivalent to the null history $0 = \hat{0} \otimes \hat{0}$ [Hj.
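
The identity $\alpha^i \wedge \neg\alpha^j = \alpha^i$ can be checked in a small toy case. The sketch below assumes two-time histories of a single qubit built from the commuting projectors onto "up" and "down", represents negation as the identity minus the projector on the tensor product space, and uses the fact that the meet of two commuting projectors is their product. The specific projectors and all names are illustrative assumptions.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    n = len(a)
    return [[a[i][j] + b[i][j] for j in range(n)] for i in range(n)]

def sub(a, b):
    n = len(a)
    return [[a[i][j] - b[i][j] for j in range(n)] for i in range(n)]

def kron(a, b):
    """Kronecker (tensor) product of two 2x2 matrices, giving a 4x4 matrix."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

IDENTITY_4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
UP = [[1, 0], [0, 0]]    # assumed single-time proposition used in alpha_i
DOWN = [[0, 0], [0, 1]]  # assumed single-time proposition used in alpha_j

alpha_i = kron(UP, UP)        # "up" at both times
alpha_j = kron(DOWN, DOWN)    # "down" at both times
not_alpha_j = sub(IDENTITY_4, alpha_j)

# For a qubit, "not down" is "up", so the three ways in which alpha_j can fail are:
three_terms = add(add(kron(UP, UP), kron(UP, DOWN)), kron(DOWN, UP))

# Meet of commuting projectors is their product.
meet = matmul(alpha_i, not_alpha_j)

print(three_terms == not_alpha_j)  # True
print(meet == alpha_i)             # True
```

Only the term in which $\alpha^j$ fails at both times overlaps with $\alpha^i$, mirroring the step from Eq. (12) to Eq. (13).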

Thus, for homogeneous history propositions that are defined using exclusive 
and exhaustive single-time propositions, it is always the case that $\alpha^i \wedge \neg\alpha^j = \alpha^i$
and thus that:

$d(\alpha^i \wedge \neg\alpha^j, \alpha^i \wedge \neg\alpha^j) = d(\alpha^i, \alpha^i)$. (14)

Thus it seems natural to equate $p(\alpha^i \wedge \neg\alpha^j|I)$ with $p(\alpha^i|I) = d(\alpha^i, \alpha^i)$ in this
case. This gives us that the probability that history proposition $\alpha^i$ is the case,
given that context $M$ is an equally weighted mixture of null-counterfactuals, is
given by the rather trivial result:

$p(\alpha^i|M) = \frac{1}{N}\left(p(\alpha^i|I) + \sum_{j \neq i}^{N} p(\alpha^i \wedge \neg\alpha^j|I)\right)$ (15)
$= \frac{1}{N}\left(p(\alpha^i|I) + (N-1)\,p(\alpha^i|I)\right)$ (16)
$= p(\alpha^i|I) = d(\alpha^i, \alpha^i)$. (17)
Note that this result does not depend upon the fact that we chose to use
equal weights. Any set of positive weights $w_i$, such that $\sum_i w_i = 1$, would work.
This result should be taken with a pinch of salt: lots of implicit assumptions
have been made in order to reach (17). We will investigate it less naively in the
next section once we have introduced a Bayesian account of such propositions.
So, it is clear that some ordered sets of single-time von Neumann measurements 
might equally well be interpreted as a mixed set of null-counterfactuals — but, of 
course, not all ordered sets of von Neumann measurements could be interpreted 
in such a manner. 
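
The claim that the result is independent of the weights can be checked with exact arithmetic. The sketch below evaluates Eq. (9) under the two identifications made above, namely $p(\alpha^i \wedge \alpha^j|I) = 0$ and $p(\alpha^i \wedge \neg\alpha^j|I) = p(\alpha^i|I)$ for $j \neq i$. The numerical probabilities and weights are hypothetical; the probabilities are deliberately not required to sum to one, since the set need not be d-consistent.

```python
from fractions import Fraction

def mixture_probability(i, p, weights):
    """Eq. (9) with p(a_i and a_j | I) = 0 and p(a_i and not a_j | I) = p(a_i | I)."""
    total = weights[i] * p[i]
    for j, w_j in enumerate(weights):
        if j != i:
            total += w_j * 0      # disjoint histories: first summation vanishes
            total += w_j * p[i]   # second summation: each term reduces to p(a_i | I)
    return total

# Hypothetical probabilities p(a_i | I) for three history propositions.
p = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 8)]

equal_weights = [Fraction(1, 3)] * 3
unequal_weights = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]

for weights in (equal_weights, unequal_weights):
    print([mixture_probability(i, p, weights) == p[i] for i in range(3)])
```

For any normalised weights the sum collapses to $p(\alpha^i|I) \sum_j w_j = p(\alpha^i|I)$, which is Eq. (17).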

Non-d-realisable propositions are propositions that can never be d-realised
with respect to some other history propositions in context $I$. So far we have
not discussed the strict meaning of the context $I$; we have simply kept the
context within the notation because of the tentative contextual meaning we
apply to propositions. In [2] it was shown that doing such a thing can help
clarify the meaning of probabilistic statements in SQT; we simply adopt the
same principle here for quantum history theories. So, for the moment, one is
asked just to accept the name '$I$' for the context whatever that context is
taken to mean. We will, in part, rectify this gnomic situation later.

So, strictly speaking, the above discussion is rather naive and we have yet to 
check that the reasoning we have used is all consistent and unambiguous. For 
example, it is not clear that the context I is well-defined globally throughout the 
mixing process. Or whether the mixing process is itself well-defined — especially 
since the weights are totally arbitrary. We shall examine this in the following 
sections. We will show that by using Bayesian reasoning such concepts do 
become consistent and less ambiguous. 

Bayesian Histories 

Bayes' rule is a rule that relates a priori probability statements to a posteriori 
probability statements. Say we have two propositions A and B and a general 
context D which refers to the general setup of the problem (and remains constant 
through the analysis) then Bayes' rule is as follows:

$p(A|BD) = \frac{p(B|AD)\,p(A|D)}{p(B|D)}$. (18)

Bayes' rule is derived from the following rule:

$p(A \cap B|D) = p(A|BD)\,p(B|D) = p(B|AD)\,p(A|D)$. (19)

We can try and use Bayes' rule (or equivalently (19)) to analyse the reasoning
we used above. If we take all history propositions then one might be tempted
to try and apply Bayes' rule to them and see if we get any form of consistency.
So, let us apply Bayes' rule in the following naive way, simply using the history
algebra '$\wedge$' instead of the standard Boolean '$\cap$' (we will justify the step using
Cox's axioms of probability later):



$p(\alpha^i \wedge \neg\alpha^j|I) = p(\alpha^i|\neg\alpha^j I)\,p(\neg\alpha^j|I)$ (20)
$= p(\neg\alpha^j|\alpha^i I)\,p(\alpha^i|I)$. (21)

By intuition one might like to assign that $p(\neg\alpha^j|\alpha^i I) := 1$ because if the
proposition $\alpha^i$ is true then obviously the proposition $\neg\alpha^j$ is true. This then
gives us that:

$p(\alpha^i \wedge \neg\alpha^j|I) = p(\alpha^i|I)$. (22)

The above analysis is consistent as long as Bayes' rule is valid for such history 
propositions, so we need to work out if Bayes' rule is a valid way to manipulate 
the probabilities of history propositions. For this to be the case then all the 
above probabilities must be well-defined. We will justify our naive application 
of Bayes' rule later, but for now let us continue along with this naive analysis 
for a moment and see how Bayes' rule (or equivalently rule (19)) applies to the
other probability assignments we might like to make. 



$p(\alpha^i \wedge \alpha^j|I) = p(\alpha^i|\alpha^j I)\,p(\alpha^j|I) = 0$ (23)
$= p(\alpha^j|\alpha^i I)\,p(\alpha^i|I) = 0$. (24)

The statement (23) is intuitively the case for disjoint history propositions
since if $\alpha^j$ is the case then $\alpha^i$ isn't the case, and similarly for the second decomposition
(24). Note that this doesn't presume that these two propositions are
probabilistically exclusive, only that given one we never infer the other.

$p(\neg\alpha^i \wedge \alpha^j|I) = p(\neg\alpha^i|\alpha^j I)\,p(\alpha^j|I)$ (25)
$= p(\alpha^j|I)$. (26)

To get to (26) we use exactly the same reasoning we used to get (22). And
so we come to ask how we interpret $p(\neg\alpha^i \wedge \neg\alpha^j|I)$. One way to look at this
probability is to decompose it as follows:

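$p(\neg\alpha^i \wedge \neg\alpha^j|I) = p(\neg\alpha^j|\neg\alpha^i I)\,p(\neg\alpha^i|I)$ (27)
$= \left(1 - p(\alpha^j|\neg\alpha^i I)\right) p(\neg\alpha^i|I)$ (28)
$= \left(1 - \frac{p(\alpha^j|I)}{p(\neg\alpha^i|I)}\right) p(\neg\alpha^i|I)$ (29)
$= p(\neg\alpha^i|I) - p(\alpha^j|I)$. (30)

But, of course, instead of using the decomposition (27) one could have used:

$p(\neg\alpha^i \wedge \neg\alpha^j|I) = p(\neg\alpha^i|\neg\alpha^j I)\,p(\neg\alpha^j|I)$ (31)
$= \left(1 - p(\alpha^i|\neg\alpha^j I)\right) p(\neg\alpha^j|I)$ (32)
$= \left(1 - \frac{p(\alpha^i|I)}{p(\neg\alpha^j|I)}\right) p(\neg\alpha^j|I)$ (33)
$= p(\neg\alpha^j|I) - p(\alpha^i|I)$. (34)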

In order for the two ways of decomposing $p(\neg\alpha^i \wedge \neg\alpha^j|I)$ to be consistent we
require that $p(\neg\alpha^j|I) - p(\alpha^i|I) = p(\neg\alpha^i|I) - p(\alpha^j|I)$. A necessary and sufficient
condition for the history propositions to satisfy this requirement is that:

$p(\alpha^i|I) + p(\neg\alpha^i|I) = K$ for all $i$, (35)

where $K$ is a positive constant. We call condition (35) quasi-realisability. When
$K = 1$ we call the probabilities realisable. Note that in assuming steps (28)
and (32) are valid we must presume that the probabilities are realisable in an a
posteriori sense in that $p(\alpha^i|\neg\alpha^j I) + p(\neg\alpha^i|\neg\alpha^j I) = 1$ for all $i$.

A set $\{\alpha^i : i = 1, 2, \ldots, N\}$ that does not satisfy (35) does not give equal
decompositions (27) and (31). Thus, in interpretation 1 any complete set of
homogeneous history propositions makes sense, but if we require consistency
with Bayesian probability theory then we must at least discuss sets of history
propositions which satisfy the stricter condition (35).
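
The equivalence between agreement of the two decompositions and condition (35) can be illustrated with exact arithmetic. In the sketch below `p[i]` stands for $p(\alpha^i|I)$ and `q[i]` for $p(\neg\alpha^i|I)$; the numerical values and the function names are hypothetical, chosen only to show one set that satisfies (35) and one that does not.

```python
from fractions import Fraction

def decomposition_27(p, q, i, j):
    """End result (30) of decomposition (27): p(not a_i | I) - p(a_j | I)."""
    return q[i] - p[j]

def decomposition_31(p, q, i, j):
    """End result (34) of decomposition (31): p(not a_j | I) - p(a_i | I)."""
    return q[j] - p[i]

def decompositions_agree(p, q):
    n = len(p)
    return all(decomposition_27(p, q, i, j) == decomposition_31(p, q, i, j)
               for i in range(n) for j in range(n) if i != j)

def is_quasi_realisable(p, q):
    """Condition (35): p(a_i | I) + p(not a_i | I) is the same constant for all i."""
    return len({p_i + q_i for p_i, q_i in zip(p, q)}) == 1

# Hypothetical realisable assignment (K = 1).
p_good = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]
q_good = [1 - x for x in p_good]

# Hypothetical assignment violating (35) for the second proposition.
p_bad = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]
q_bad = [Fraction(3, 4), Fraction(1, 2), Fraction(1, 2)]

print(is_quasi_realisable(p_good, q_good), decompositions_agree(p_good, q_good))
print(is_quasi_realisable(p_bad, q_bad), decompositions_agree(p_bad, q_bad))
```

In both cases the two checks return the same truth value, as expected from the necessity and sufficiency of (35).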

So, if we identify that $p(\alpha^i | I) = d(\alpha^i, \alpha^i)$ and that $p(\neg\alpha^i | I) = d(\neg\alpha^i, \neg\alpha^i)$,
then a sufficient condition for all the above to be consistent by both decompositions
(27) and (31) is that everything is d-realisable. If the history propositions
weren't d-realisable then there is no a priori reason why decompositions (27)
and (31) should match. However, are all these probabilities well-defined? All
the probabilities that we identify with decoherence probabilities are obviously
well-defined in the sense that they are bounded between 0 and 1. But we haven't
yet identified whether the naive conditional probabilities are all well-defined.

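For example, suppose we make the identification that $p(\neg\alpha^j | \alpha^i I) = 1$. Then,
using Bayes' rule, we derive:

$p(\alpha^i | \neg\alpha^j I) = \frac{p(\alpha^i | I)}{p(\neg\alpha^j | I)} = \frac{d(\alpha^i, \alpha^i)}{d(\neg\alpha^j, \neg\alpha^j)}.$   (36)

In order for the above to be bounded by 0 and 1 we require that:

$0 \le \frac{d(\alpha^i, \alpha^i)}{d(\neg\alpha^j, \neg\alpha^j)} \le 1.$   (37)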

If $\alpha^i$ is more probable than $\neg\alpha^j$ in the context $I$ then the above condition
will not be satisfied. The next question we must ask is what types of history
propositions do we require for this Bayesian analysis to be consistent? In terms
of the HPO form of CH [H] we define that the homogeneous history propositions
$\{\alpha^i : i = 1, 2, \ldots, N\}$ defined using exclusive sets of single-time propositions
are all mutually disjoint: $\alpha^i \perp \alpha^j$ for all $i,j$ such that $i \neq j$. In terms of the
natural orthoalgebra of history propositions, this means that $\alpha^i \le \neg\alpha^j$ for all
$i,j$ such that $i \neq j$. Thus if the decoherence functional preserves the partial
order defined on the history proposition space then condition (37) is satisfied.
Isham and Linden J2j have argued that this need not be the case; there are
examples where the following is not true:

$\alpha \le \beta \Rightarrow d(\alpha, \alpha) \le d(\beta, \beta).$   (38)

They give a specific example which disobeys (38). The sum-over-paths
formulation for SQT does obey (38) and it is thus not clear whether we should
assume it in general history theories [13]. In order to satisfy (37), however, we
must use sets of history propositions such that the following is satisfied:

$\alpha^i \le \neg\alpha^j \Rightarrow d(\alpha^i, \alpha^i) \le d(\neg\alpha^j, \neg\alpha^j)$ for all $i \neq j$.   (39)

Presuming $\{\alpha^i : i = 1, 2, \ldots, N\}$ consists of d-realisable homogeneous history
propositions that are defined using exclusive and exhaustive single-time propositions
then this Bayesian analysis is consistent as long as (39) is satisfied. If
this is the case then the probabilities $p(\alpha^i \vee \alpha^j | \alpha^k I)$ and $p(\alpha^i \vee \alpha^j | \neg\alpha^k I)$ are
well-defined (in the sense of being bounded by 0 and 1) for all $i,j,k$. This can
be seen just by invoking the conditional probabilities invoked above.



$p(\alpha^i \vee \alpha^j | \alpha^k I) := p(\alpha^i | \alpha^k I) + p(\alpha^j | \alpha^k I) - p(\alpha^i \wedge \alpha^j | \alpha^k I)$   (40)

If $k \neq i$ and $k \neq j$ then the RHS of (40) is $0 + 0 - 0 = 0$. If $k = i \neq j$ then
the RHS of (40) is $1 + 0 - 0 = 1$ and if $k = j \neq i$ then it is $0 + 1 - 0 = 1$. If
$k = i = j$ then the RHS is $1 + 1 - 1 = 1$. This is all as we would expect by
intuition. Similarly,



$p(\alpha^i \vee \alpha^j | \neg\alpha^k I) := p(\alpha^i | \neg\alpha^k I) + p(\alpha^j | \neg\alpha^k I) - p(\alpha^i \wedge \alpha^j | \neg\alpha^k I)$   (41)

is well-defined by construction. We can also define '$\vee$' relations for the
inhomogeneous negations.



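$p(\alpha^i \vee \neg\alpha^j | \alpha^k I) := p(\alpha^i | \alpha^k I) + p(\neg\alpha^j | \alpha^k I) - p(\alpha^i \wedge \neg\alpha^j | \alpha^k I).$   (42)
$p(\alpha^i \vee \neg\alpha^j | \neg\alpha^k I) := p(\alpha^i | \neg\alpha^k I) + p(\neg\alpha^j | \neg\alpha^k I) - p(\alpha^i \wedge \neg\alpha^j | \neg\alpha^k I).$   (43)
$p(\neg\alpha^i \vee \neg\alpha^j | \alpha^k I) := p(\neg\alpha^i | \alpha^k I) + p(\neg\alpha^j | \alpha^k I) - p(\neg\alpha^i \wedge \neg\alpha^j | \alpha^k I).$   (44)
$p(\neg\alpha^i \vee \neg\alpha^j | \neg\alpha^k I) := p(\neg\alpha^i | \neg\alpha^k I) + p(\neg\alpha^j | \neg\alpha^k I) - p(\neg\alpha^i \wedge \neg\alpha^j | \neg\alpha^k I).$   (45)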
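
The case analysis given for (40) can be enumerated mechanically. The sketch below is only illustrative: it assumes, as in the text, that within a mutually exclusive set the conditional probability p(a_i | a_k I) is 1 when i = k and 0 otherwise, and that the conjunction a_i AND a_j is only compatible with a_k when i = j = k. The helper names are hypothetical.

```python
from itertools import product

def p_given_ak(i, k):
    """p(a_i | a_k I) for mutually exclusive propositions: 1 if i == k, else 0."""
    return 1 if i == k else 0

def p_and_given_ak(i, j, k):
    """p(a_i AND a_j | a_k I): non-zero only when i == j == k."""
    return 1 if i == j == k else 0

def p_or_given_ak(i, j, k):
    """Right-hand side of (40)."""
    return p_given_ak(i, k) + p_given_ak(j, k) - p_and_given_ak(i, j, k)

N = 3  # illustrative number of history propositions
table = {(i, j, k): p_or_given_ak(i, j, k) for i, j, k in product(range(N), repeat=3)}

# Every entry is a valid probability, and it equals 1 exactly when k is i or j.
all_bounded = all(0 <= value <= 1 for value in table.values())
matches_intuition = all(value == (1 if k in (i, j) else 0)
                        for (i, j, k), value in table.items())
print(all_bounded, matches_intuition)
```

The same enumeration could be repeated for (41) to (45) once the corresponding conditional assignments such as (46) are supplied.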

All the above, by construction, give answers consistent with classical
probabilistic intuition as long as d-realisability and (39) are satisfied. So, if we discuss
exhaustive sets of d-realisable propositions such that:

$p(\alpha^i | \neg\alpha^j I) = \frac{p(\alpha^i | I)}{p(\neg\alpha^j | I)} \le 1$ for all $i,j$ such that $i \neq j$,   (46)

then we can, by construction, get complete consistency with Bayesian reasoning.
History propositions $\alpha^i$ and $\alpha^j$ within such a set are additive over
all conditional probabilities even if they are not additive a priori such that
$p(\alpha^i \vee \alpha^j | I) \neq p(\alpha^i | I) + p(\alpha^j | I)$. But is there anything wrong with two
propositions being additive in one context and not additive in another? Of course
not. In the Bayesian framework, probabilities are always defined contextually 
PP and exclusivity is a contextual property of propositions. 

This is not to say that quantum probabilities definitely don't behave in ways 
that go against classical intuition, only that classical Bayesian probability theory 
might take us a little further than we may have thought in analysing quantum 
history propositions. This approach, which we call Bayesian Histories (BH), 
has a clear pedagogical basis and, as we shall argue below, may tentatively be 
experimentally distinguishable from SQT. 
So, using BH, we can define history propositions to be exclusive in certain
contexts. But, of course, we can identify these contexts either as a priori ones
or a posteriori ones in reference to Bayes' rule (18) depending upon what stage
of the Bayesian updating process we are considering.



A Pedagogical Account of Additivity 

As we discussed above, and recently emphasised by Mana, propositions have
certain properties that are contextual. The exclusivity of propositions is a
contextual property of propositions. Therefore if we have two propositions $A$ and
$B$ it is not necessarily the case that they are exclusive in any given context (nor
even defined in any given context). We have defined above that the exclusivity of
two propositions arises when $p(A \cap B | D) = 0$ and $p(A \cup B | D) = p(A | D) + p(B | D)$
such that all probabilities are well-defined (this happily coincides with the
standard Bayesian notion of exclusivity). In a similar way, consistent historians
define 'd-consistency', although there is a subtle distinction between the two.
In CH contexts are defined to be situations in which d-consistency occurs, whereas
in BH contexts are far more general. The exclusivity of propositions might
be gained when going from a priori probabilities to a posteriori probabilities.
So, for propositions $A$ and $B$ and prior information $D$ it might be the case that:

$p(A \cup B | D) \neq p(A | D) + p(B | D)$   (47)

even though when we update using further information $E$ it is the case that:

$p(A \cup B | ED) = p(A | ED) + p(B | ED).$   (48)

This is a possibility we can imagine since exclusivity is a contextual property 
of propositions. One might be able to define contexts which give additive a 
posteriori probabilities using BH, rather than restricting ourselves to additive 
a priori probabilities (as one might put it when using CH). 

Using Bayes' rule we can also naively derive the following rule: 



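$p(A | (D \cup E) F) = \frac{p(D \cup E | AF)\, p(A | F)}{p(D \cup E | F)}$   (49)
$= \frac{p(D | AF)\, p(A | F) + p(E | AF)\, p(A | F)}{p(D \cup E | F)}$   (50)
$= \frac{p(D \cap A | F) + p(E \cap A | F)}{p(D \cup E | F)}.$   (51)

We get to (50) as long as $D$ and $E$ are additive on the a posteriori context $AF$.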
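
For ordinary (classical) propositions, steps (49) to (51) can be verified on a finite sample space. The following is a minimal sketch under assumptions that are not in the original text: propositions are modelled as subsets of a hypothetical twelve-point space with a uniform measure, and the sets A, D, E, F are arbitrary illustrative choices.

```python
from fractions import Fraction

def prob(event, given):
    """Conditional probability p(event | given) under a uniform measure."""
    if not given:
        raise ValueError("conditioning event has zero probability")
    return Fraction(len(event & given), len(given))

# Hypothetical toy sets: D and E are exclusive, so they are additive on AF.
F = frozenset(range(10))
D = frozenset({0, 1, 2})
E = frozenset({3, 4, 5, 6})
A = frozenset({1, 2, 4, 8, 11})

def three_forms(A, D, E, F):
    """Return the left-hand side of (49) and the right-hand sides of (49), (50), (51)."""
    denom = prob(D | E, F)
    lhs = prob(A, (D | E) & F)
    rhs_49 = prob(D | E, A & F) * prob(A, F) / denom
    rhs_50 = (prob(D, A & F) + prob(E, A & F)) * prob(A, F) / denom
    rhs_51 = (prob(D & A, F) + prob(E & A, F)) / denom
    return lhs, rhs_49, rhs_50, rhs_51

exclusive_case = three_forms(A, D, E, F)

# If D and E overlap, the additive step (50) over-counts the overlap.
overlapping_case = three_forms(A, D, frozenset({2, 3, 4}), F)
print(exclusive_case, overlapping_case)
```

The overlapping case shows why additivity of D and E on the context AF is needed for (50): Bayes' rule (49) still holds, but the sum in (50) no longer equals it.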

Throughout the analysis that gave us (17) we assumed that the context $I$
is well-defined and globally applicable to each null-counterfactual. This is an
assumption that need not be valid. For example, one could either make the
association that $p(\alpha^i | C) = d(\alpha^i, \alpha^i)$, identifying the decoherence functional
with a priori probabilities, or one could associate the decoherence functional
with a posteriori probabilities:

$p(\alpha^i | (\alpha^k \vee \neg\alpha^k) C) = p(\alpha^i | 1^k C) = d(\alpha^i, \alpha^i).$   (52)

Now, if we associate the decoherence functional with a posteriori probabilities
then such probabilities are independent of the context $1^k$ in which they
are taken. This is a kind of non-contextuality. Even if the values are the same,
however, they may still behave differently: probabilities with the same value
are not necessarily the same probabilities. Thus we should keep their notational
dependence upon context even if their values are the same. When can one
interpret each $1^k$ as a null-counterfactual? By (51) we have:

$p(\alpha^i | 1^k C) = \frac{p(\alpha^i \wedge \alpha^k | C) + p(\alpha^i \wedge \neg\alpha^k | C)}{p(1^k | C)}.$   (53)

So, if we associate the decoherence functional probabilities with a posteriori
probabilities rather than a priori probabilities then we have that:

$p(\alpha^i | 1^k C) = d(\alpha^i, \alpha^i) = \frac{p(\alpha^i | C)}{p(1^k | C)}.$   (54)

This means that $p(\alpha^i | 1^k C) \neq p(\alpha^i | C)$. So, even if probabilities don't depend
upon the contexts $1^k C$, probabilities still depend on whether such a context
is known to be the case or not. A priori probabilities are not the same as
a posteriori probabilities. It is rather natural to make the association that
$I = 1^k C$, which is why we must differentiate between $C$ and $I$ in the above
presentation. We reserve the name '$C$' for a priori contexts. Thus all
the naive probability assignments given in context $I$ can be passed across to
probability assignments in contexts $1^k C$ for all $k$.

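Using (51) we can discuss the probabilities assigned to an exhaustive set of
contexts $\vee_k 1^k$:

$p(\alpha^i | (\vee_k 1^k) C) = \frac{\sum_k p(\alpha^i \wedge 1^k | C)}{p(\vee_k 1^k | C)}$   (55)
$= \frac{p(\alpha^i | 1^k C) \sum_k p(1^k | C)}{p(\vee_k 1^k | C)} = d(\alpha^i, \alpha^i).$   (56)

Therefore, a set of contexts $\{1^k\}$ that are exhaustive on $C$ gives us the
standard probabilities predicted by SQT.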

One might now ask how the a priori probabilities behave. We presume
that $p(1^k | \alpha^i C) = 1$ for all $k$ since such conditional probability assignments
are natural. Thus we have that ratios of a priori probabilities and ratios of a
posteriori probabilities are equal, for example:

$\frac{p(\alpha^i | C)}{p(\neg\alpha^k | C)} = \frac{p(\alpha^i | 1^k C)}{p(\neg\alpha^k | 1^k C)}.$   (57)

In order for the probabilities $p(\alpha^i \wedge \neg\alpha^k | C)$ to be well-defined we thus require
that such ratios are less than 1. This is thus equivalent to requiring (39). In
order for probabilities $p(\neg\alpha^i \wedge \neg\alpha^k | C)$ to be consistent with Bayes' rule we also
require that the a priori probabilities are quasi-realisable:

$p(\alpha^i | C) + p(\neg\alpha^i | C) = L$ for all $i$   (58)

where $L$ is a constant. Since we are assuming that the a posteriori probabilities
are independent of contexts we require that $p(\alpha^i | 1^k C) = p(\alpha^i | 1^{k'} C)$. We thus
have that $p(1^{k'} | C) = p(1^k | C)$. This suggests that all contexts $1^k$ are a priori
equally likely so:

$p(1^k | C) = L'$ for all $k$   (59)

where $L'$ is a constant. Comparing a posteriori and a priori probabilities we
have that:

$p(\alpha^i | 1^k C) + p(\neg\alpha^i | 1^k C) = \frac{p(\alpha^i | C) + p(\neg\alpha^i | C)}{p(1^k | C)} = \frac{L}{L'} = K$ for all $i$.   (60)

This is thus completely consistent with our requirement that the a posteriori
probabilities must be quasi-realisable (35). Thus if $L/L' = K = 1$ then we have
d-realisable history propositions for all $i$. If we have $K \neq 1$ then we have a
quasi-d-realisable set of history propositions. If $L = L'$ then we have a very cogent
interpretation: all the contexts that we invoke consist of d-consistent sets and
are thus what we have called null-counterfactuals. So, if $L = L'$, we represent
experiments using an equally weighted mixture of null-counterfactuals. When
$K \neq 1$ we don't have a good interpretation so, for now, we reject such cases.

We have a sound interpretation for a posteriori contexts when $K = 1$, but
what does the a priori context $C$ refer to? We don't interpret $C$ here except
to say that if Bayesian probability is the correct probability to use then we
must require that such a priori contexts are consistent with Bayes' rule (18).
$C$ is simply some context in which the a priori probabilities are well-defined.
$C$ is our knowledge about $\{1^k\}$ and our knowledge about $\{1^k\}$ is that we don't
know which $1^k$ happens, so we apply equal a priori probabilities. The standard
von Neumann collapse formulation predicts that all probabilities for multi-time
measurements are well-defined, but in BH only those that give consistency with
Bayesian reasoning are valid. Thus the collapse hypothesis is not deemed
universally valid in BH; it is rather only a convenient hypothesis in certain situations.

Let's look at a standard interference device: a Mach-Zehnder interferometer.
In the standard interpretation there are two possible history propositions which
end in detection by a given detector labelled $e$ (these histories we call $\alpha^u$ and
$\alpha^d$) and SQT predicts that each one happens with probabilities given by the
decoherence functional: $d(\alpha^u, \alpha^u)$ and $d(\alpha^d, \alpha^d)$ respectively. We interpret $\alpha^u$
to be the history proposition that the particle takes the upper path and $\alpha^d$ as
the proposition that it takes the lower path. Thus, in the standard interpretation,
the probabilities given by the decoherence functional using these two
propositions represent the situation where the path of the particle is measured.
Interference suggests that:

$d(\alpha^u \vee \alpha^d, \alpha^u \vee \alpha^d) \neq d(\alpha^d, \alpha^d) + d(\alpha^u, \alpha^u).$   (61)

This means that, in the standard interpretation, when you don't measure the
path you predict a different probability at the detector to that you would predict
had you measured the path. One can loosely say then that in one 'context' the
histories are exclusive and in another they are not, but how do we formalise such
notions? It is clear that in the space of history propositions it is not the case
that $\neg\alpha^u = \alpha^d$. We must be more subtle in our use of the negation operation.
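
The non-additivity in (61) can be illustrated with a toy two-mode model of the interferometer. This is only a sketch under stated assumptions, none of which come from the original text: a pure input state in the upper mode, a symmetric 50/50 beam-splitter matrix, a phase shifter in the upper arm, detector e placed at output port 0, and class operators applied directly to the state vector so that d(alpha, beta) reduces to an inner product of branch vectors.

```python
import cmath
import math

S = 1 / math.sqrt(2)
BEAM_SPLITTER = [[S, 1j * S], [1j * S, S]]  # assumed symmetric 50/50 convention
P_UP = [[1, 0], [0, 0]]                     # particle in the upper arm
P_DOWN = [[0, 0], [0, 1]]                   # particle in the lower arm
P_DETECTOR_E = [[1, 0], [0, 0]]             # click in detector e (output port 0)

def apply(matrix, vec):
    """Multiply a 2x2 matrix into a two-component vector."""
    return [sum(matrix[r][c] * vec[c] for c in range(2)) for r in range(2)]

def inner(v, w):
    """<v|w>, conjugating the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(v, w))

def branch(path_projector, phase):
    """Branch vector C_alpha |psi> for the history 'this path, then detector e'."""
    vec = apply(BEAM_SPLITTER, [1 + 0j, 0j])
    vec = apply(path_projector, vec)
    vec = apply([[cmath.exp(1j * phase), 0], [0, 1]], vec)  # phase shifter, upper arm
    vec = apply(BEAM_SPLITTER, vec)
    return apply(P_DETECTOR_E, vec)

def decoherence(v_alpha, v_beta):
    """d(alpha, beta) for a pure state, written with branch vectors."""
    return inner(v_beta, v_alpha)

def report(phase):
    """Return d(u,u), d(d,d) and d(u or d, u or d) for a given phase shift."""
    v_u, v_d = branch(P_UP, phase), branch(P_DOWN, phase)
    v_e = [a + b for a, b in zip(v_u, v_d)]  # coarse-grained history: no path detection
    return (decoherence(v_u, v_u).real,
            decoherence(v_d, v_d).real,
            decoherence(v_e, v_e).real)

for phase in (0.0, math.pi / 2, math.pi):
    print(phase, report(phase))
```

In this toy model the two path histories each carry weight 1/4 regardless of the phase, while the coarse-grained weight is (1 - cos(phase))/2, so additivity only holds at special phases such as pi/2.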
Using interpretation 2 we look at this path detection experiment in a subtly
distinct fashion. There are two possible null-counterfactuals $1^u = \alpha^u \vee \neg\alpha^u$ and
$1^d = \alpha^d \vee \neg\alpha^d$ (let's presume explicitly that $\alpha^u$ and $\alpha^d$ are both d-realisable
since we have a good interpretation for such propositions). Using (51) we make
the association:



$p(\alpha^u | 1^d C) = \frac{p(\alpha^u \wedge \alpha^d | C) + p(\alpha^u \wedge \neg\alpha^d | C)}{p(1^d | C)}$   (62)
$= \frac{0 + p(\alpha^u | \neg\alpha^d C)\, p(\neg\alpha^d | C)}{p(1^d | C)} = \frac{p(\alpha^u | C)}{p(1^d | C)}.$   (63)



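We can do this as long as the probabilities for $\alpha^d$ and $\neg\alpha^d$ are well-defined
and additive on the a posteriori context $\alpha^u C$. With similar provisos we can
argue that:

$p(\alpha^d | 1^u C) = \frac{p(\alpha^d \wedge \alpha^u | C) + p(\alpha^d \wedge \neg\alpha^u | C)}{p(1^u | C)}$   (64)
$= \frac{p(\alpha^d | C)}{p(1^u | C)}.$   (65)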

When we do the experiment we have no a priori reason to expect one
null-counterfactual to occur over the other so we assign equal weights to each,
$p(1^u | C) = \frac{1}{2} = p(1^d | C)$. Each null-counterfactual is deemed to be apt with
these a priori probabilities. So the Mach-Zehnder experiment can consist of an
equally weighted mixed set $M$ of null-counterfactuals such that:

$p(\alpha^u | M) = \frac{p(\alpha^u \wedge 1^u | C) + p(\alpha^u \wedge 1^d | C)}{p(1^u \vee 1^d | C)}$   (66)
$= d(\alpha^u, \alpha^u).$   (67)



Thus we recover the SQT predictions for path detection as long as we use
d-realisable history propositions which give a consistent Bayesian analysis.
Otherwise we must use a different set of null-counterfactuals: the same set of
null-counterfactuals can't give us the case when path detection doesn't occur.
We could also try to define the probability $p(\neg\alpha^u | M)$ and in order to do so we
would require that the probability $p(\neg\alpha^u \wedge \neg\alpha^d | C)$ is well defined, and this
requires quasi-realisability. So, for consistency of the reasoning we use we require
at least quasi-realisability for both a priori and a posteriori probabilities: we
require (58) and (35) respectively.

We have investigated the situation where the path lengths are equal but, of
course, one can easily introduce phase shifters into the arms of the interferometer.
Note that the dynamics is invoked in the very definition of the decoherence
functional so phase shifters would be represented by a change in the evolution
between two times from standard unitary evolution to one including a change
in phase: $d \to d'$. Obviously this would have no effect for the path detection
experiments but it would have an effect on non-path detection experiments such
that $d'(\alpha^u \vee \alpha^d, \alpha^u \vee \alpha^d)$ would depend on a phase factor. Note $\alpha^u \vee \alpha^d = \alpha^e$
where $\alpha^e$ is the history proposition that the particle is detected by a click in
detector $e$ without any path detection.

All this discussion about null-counterfactuals is perhaps rather controversial; 
it is based around mainly notational issues. We have invoked them here, how- 
ever, simply in an attempt to distinguish quasi-realisable and realisable histories. 
Even if one does not accept this null-counterfactual formalism we hope that you 
still take away with you the primary fact that consistency with Bayesian reasoning
produces a consistency condition: decoherence probabilities must be at least
quasi-realisable and must satisfy (39). Moreover, d-realisable histories seem far less
controversial than quasi-d-realisable ones. As to why we should use Bayesian
reasoning in the first place, we shall get onto that in a moment once we have 
discussed linear positivity. 

Quasi-realisability vs Linear Positivity 

Having a rather natural Bayesian interpretation for complete sets of d-realisable
history propositions, let us now discuss quasi-d-realisable history propositions
that satisfy (35). These don't give a good interpretation so it is tempting just
to reject them, but let's look a little closer at them. Non-d-realisable history
propositions simply satisfy the inequality $d(\alpha, \neg\alpha) \neq 0$. We have that:

$\mathrm{Re}\, d(\alpha, \neg\alpha) = \mathrm{Re}\, d^{LP}(\alpha) - d(\alpha, \alpha)$   (68)

where $d^{LP}(\alpha)$ is defined on homogeneous history propositions in a similar manner
to the decoherence functional (c.f. Eq.JJJ):

$d^{LP}(\alpha) := \mathrm{tr}(C_\alpha \rho).$   (69)

As long as they are positive the $\mathrm{Re}\, d^{LP}(\alpha)$ behave like probabilities. In the
literature they are called Linear Positive (LP) probabilities and were originally 
promoted by Goldstein and Page ^1] as a less restrictive alternative to CH 
probabilities. Therefore d-realisable history propositions have the property that 
LP probabilities and decoherence functional probabilities have the same value. 
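
Identity (68) can be spot-checked numerically for a single qubit. The sketch below rests on assumptions that are not spelled out in this section: the decoherence functional is taken in the form d(alpha, beta) = tr(C_alpha rho C_beta^dagger), the class operator of the negation is taken to be 1 - C_alpha, and the particular projectors and state are arbitrary illustrative choices.

```python
import math

IDENTITY = [[1, 0], [0, 1]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[complex(a[c][r]).conjugate() for c in range(2)] for r in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def projector(v):
    """|v><v| for a normalised two-component vector v."""
    return [[v[r] * complex(v[c]).conjugate() for c in range(2)] for r in range(2)]

def decoherence(c_alpha, c_beta, rho):
    """Assumed form d(alpha, beta) = tr(C_alpha rho C_beta^dagger)."""
    return trace(matmul(matmul(c_alpha, rho), dagger(c_beta)))

def linear_positivity(c_alpha, rho):
    """d^LP(alpha) = tr(C_alpha rho), as in (69)."""
    return trace(matmul(c_alpha, rho))

# Hypothetical two-time history: projector p_first followed by projector p_second.
theta = math.pi / 5
rho = projector([math.cos(0.3), 1j * math.sin(0.3)])
p_first = projector([1, 0])
p_second = projector([math.cos(theta), math.sin(theta)])
c_alpha = matmul(p_second, p_first)
c_not_alpha = [[IDENTITY[r][c] - c_alpha[r][c] for c in range(2)] for r in range(2)]

lhs = decoherence(c_alpha, c_not_alpha, rho).real
rhs = linear_positivity(c_alpha, rho).real - decoherence(c_alpha, c_alpha, rho).real
print(lhs, rhs)
```

The two printed numbers agree up to rounding, so in this toy case the proposition is d-realisable exactly when the real part of the LP value equals the decoherence value.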

Quasi-realisability enforces:

$p^{LP}(\alpha^i | 1^k C) + p^{LP}(\neg\alpha^i | 1^k C) = K'.$   (70)

Note, however, that LP probabilities are always, by definition, exhaustive when
defined on a partition of unity $\sum_i \alpha^i = 1$ so $K' = 1$ for all LP probabilities. So
now we have a choice: either we attempt to interpret quasi-d-realisable
propositions or we extend our discussion to LP propositions. There are a couple
of reasons why going the LP way is pedagogically interesting. Firstly, all LP
probabilities are realisable, hence we don't need to worry about non-realisable
probabilities cropping up and having to interpret them. We shall give another
reason why we reject non-realisable propositions when we discuss Cox's axioms.
Secondly, LP probabilities are explicitly non-contextual; their interpretation
doesn't depend upon what other history propositions they are invoked with.
This makes the non-contextuality assumption a bit more explicit such that LP
probabilities do not depend upon which null-counterfactuals they are defined
with respect to. So we can, rather naturally, define:

$p^{LP}(\alpha^i | 1^k C) = p^{LP}(\alpha^i | 1^{k'} C)$ for all $k$,   (71)

i.e. its value is independent of the context, labelled by $k$, we use. It still depends
on the fact that we have a well-defined context hence we keep the notational
dependence upon context and don't remove it entirely. Even if one represents
non-contextuality by assuming the contexts are all equivalent and called, say, $I$,
then one still gets a consistency condition, namely quasi-realisability, that LP
probabilities satisfy since they are realisable. This non-contextuality assumption
is rather analogous to the Gleason non-contextuality (which can also be
expressed in terms of null-counterfactuals); after all, what is non-contextuality if
not an assumption that, if you don't know which null-counterfactual you are
discussing, you give each possible null-counterfactual equal a priori weighting?
So, there may exist a theorem akin to Gleason's which shows our LP probability
assignments to be uniquely defined by certain natural assumptions (although
we would have to justify the LP set of history propositions before discussing
such a theorem; we do not attempt such a thing here).

So LP probabilities can then be interpreted in a way that is exactly analogous
to the way we interpreted the complete sets of d-realisable history propositions.
In order for the Bayesian probability assignments to be well-defined probabilities
bounded by 0 and 1 we must, in analogy with (39), require that all LP
probabilities preserve the partial order on the history space for all LP history
propositions:

$\alpha^i \le \neg\alpha^j \Rightarrow \mathrm{Re}\, d^{LP}(\alpha^i) \le \mathrm{Re}\, d^{LP}(\neg\alpha^j)$ for all $i \neq j$.   (72)

This is satisfied for all LP history propositions.

We can define $K' = L''/L'''$ in an analogous way such that:

$p^{LP}(\alpha^i | C) + p^{LP}(\neg\alpha^i | C) = L''$ for all $i$   (73)

and

$p^{LP}(1^k | C) = L'''$ for all $k$.   (74)

For LP history propositions we have that $K' = 1$ and thus that $L'' = L'''$.
Thus we interpret the a priori context $C$ to be the knowledge that we
have no knowledge about the contexts $1^k$ and thus assign them equal a priori
probabilities. Thus we can, if we wish, extend BH to include all LP history
propositions and not just d-realisable propositions (which are, of course, also
LP).

If BH is correct then it helps, in part, to 'explain' interference because the 
probabilities invoked obey rules that are consistent with our classical intuition. 
If BH is incorrect — if non-LP history propositions remain well-defined and ex- 
perimentally realisable — then we have a theory that obeys our classical intuition, 
to some extent at least, which SQT disobeys — this in itself would be a novel 
result. 

Why Bayes' Rule? 

Having shown that there is a certain amount of consistency between Bayes' 
rule and the LP formalism the following programme presents itself: perhaps 
we can derive the LP probabilities by taking the history algebra and applying
something akin to Cox's axioms [3] to this space. Cox's work derives probability
theory over an underlying Boolean algebra using simple consistency conditions 
that a natural form of inductive reasoning should obey, so it seems natural just 
to try and apply a similar kind of reasoning with the history algebra. If such 
a proof is found and as long as the history algebra could then be justified by a 
priori means — or by simple physically justified axioms — then one would be able 
to prove that the LP formalism is just another kind of probability theory. 

It is clear, however, that our naive assumption of using Bayes' rule is justified 
by Cox's axioms. Cox's first axiom is that the probability of a statement con- 
ditional upon some hypothesis determines the negation of that same statement 
upon the same hypothesis. The second axiom, more relevant here, is that the
probability that two statements are both true upon a given hypothesis is deter- 
mined alone from the probability of one of the statements conditional upon the 
given hypothesis and the probability of the other statement conditional upon 
the hypothesis conjoined with the presumption that the first statement is true. 
In our notation this is written schematically as: 

$p(\alpha \wedge \beta | I) := F[p(\beta | I), p(\alpha | \beta I)]$   (75)

where F is an arbitrary function to be determined that is sufficiently well- 
behaved for our purposes. 

The underlying algebra for history propositions is associative so the following 
statement is true: 

$\alpha \wedge (\beta \wedge \gamma) = (\alpha \wedge \beta) \wedge \gamma = \alpha \wedge \beta \wedge \gamma.$   (76)

The above property (76) forces $F$ not to be arbitrary and Cox proves that
Bayes' rule is a consequence (Jaynes highlights a more general proof in £Q): 

$p(\alpha | \beta I) = \frac{p(\alpha \wedge \beta | I)}{p(\beta | I)}.$   (77)

Since the associativity of '$\wedge$' (76) is valid for our quantum logics then Bayes'
rule follows and the above work is justified to an extent. Note that for
homogeneous histories defined over the same temporal support we have that $\alpha \wedge \beta = \beta \wedge \alpha$
so the histories equivalent of the multiplication rule (19) also follows in such
cases.
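
The role of associativity can be made concrete: applying (75) to a threefold conjunction in the two groupings allowed by (76) leads to the functional equation F(F(x, y), z) = F(x, F(y, z)), which the product rule satisfies. The following is a numerical spot check on a small grid, not a proof and not Cox's derivation; the candidate functions and the grid are hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import product

def is_associative(F, grid):
    """Check F(F(x, y), z) == F(x, F(y, z)) for every triple drawn from `grid`."""
    return all(F(F(x, y), z) == F(x, F(y, z)) for x, y, z in product(grid, repeat=3))

def product_rule(x, y):
    """Candidate F corresponding to the usual product rule."""
    return x * y

def mean_rule(x, y):
    """A deliberately wrong candidate: the arithmetic mean."""
    return (x + y) / 2

grid = [Fraction(k, 4) for k in range(5)]  # plausibility values 0, 1/4, ..., 1

print(is_associative(product_rule, grid))  # the product rule passes on this grid
print(is_associative(mean_rule, grid))     # the mean fails, e.g. at (0, 0, 1)
```

A check of this kind only rules candidates out; showing that the product form is essentially the only well-behaved solution is the content of Cox's theorem itself.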

Thus, although the above analysis initially seems quite naive there is some 
truth to it — Bayes' rule, if nothing else, should be obeyed by any natural notion 
of probability by Cox's axioms. We have, however, yet to generalise Cox's 
other proofs to the HPO algebra of history propositions proper. This remains 
work in progress. It is clear that the decoherence probabilities (which are also
the standard probability assignments invoked using von Neumann collapse), at
least in the Hamiltonian formulation, need not always obey Bayes' rule: they
can disobey (37), for example, and need not be realisable or quasi-realisable. We
therefore have to restrict our attention to either d-realisable histories or LP ones (or
use some other assignment).

A naive application of Cox's first axiom, 

$p(\neg\alpha | I) := G[p(\alpha | I)],$   (78)

suggests we should use a fixed value of $K$, hence why we have restricted our
attention to realisable histories. Hence we should use realisable probabilities 
that obey Bayes' rule — we should use LP probabilities. This approach may 
not be considered wholly satisfactory because we are not giving probability 
assignments to all histories but only the LP subset of the algebra. This is 
curiously analogous to the situation in Youssef's work [18] where, in deriving
a form of SQT as an 'exotic' complex probability theory, he has to presume a 
subset where the standard real probabilities are manifested. We have placed 
the term 'exotic' in scare quotes because we prefer not to use the term. When 
invoking Bayesian reasoning there is no a priori reason probabilities should be 
real numbers (also see We only need to presume they are real when notions 

of relative frequency are applicable. Hence we would rather call such theories
just probability theories; there is nothing really 'exotic' about them. Hence we 
leave open the possibility of deriving the whole of the histories formalism in 
such a manner. We investigate such a possibility in forthcoming work. 

Experimental Differentiation 

So if we re-define multi-time measurements to be equally weighted sets of null- 
counterfactuals (due to some principle of insufficient reason) we can get all 
Linearly Positive (LP) probabilities. One might wish to take this very seriously. 
There are two tacks that we can take in regards to BH. Firstly, we could choose 
to use BH to discuss closed quantum systems. LP probabilities were originally 
promoted in this manner [14] because as soon as we discuss closed quantum
systems then using Eq. Q to assign probabilities to history propositions simply 
becomes a postulate. Using the real part of Eq. (69) to assign candidate probabilities
is another, equally valid, postulate. And, of course, any rule that is
distinct from the von Neumann projection postulate must be investigated very
carefully. Diosi ^Sj has argued that the LP probabilities should not be used
as probability assignments because they are not consistent with the statistical 
independence of subsystems. Implicit in this critique is the use of a relative 
frequency interpretation of probability but it is not clear that using a relative 
frequency interpretation for closed systems is wholly sound. The only other 
option we have (propensities being simply objectivised relative frequencies) is 
to use Bayesian probabilities; and if we do use Bayesian probabilities then LP 
probabilities are promoted over decoherence probabilities as they have a very 
simple interpretation and obey Cox's axioms for the LP subset. Bayesian probability
theory encompasses the use of relative frequencies in certain situations
PP so there is nothing necessarily untoward about this. 

Secondly, we could try and apply BH to actual experiments. Anastopoulos 
suggests, in 0], that we should try and experimentally check that d-inconsistent
sets do really make good statistical sense. With a similar emphasis it may also 
be prudent just to check whether non-LP history propositions do really make 
good statistical sense in quantum experiments. But, of course, what do we 
mean by "good statistical sense"? If we assume that "good statistical sense" 
is equivalent with the statement "is consistent with Bayes' rule" then BH is 
promoted as a tautology. Otherwise one must use a form of statistics that is 
inconsistent with Bayes' rule and is also well-defined. 

So, it is clear that, at present, the only way we know how to get relative 
frequencies out from quantum history theories is by discussing d-consistent sets.
Those sets that aren't d-consistent may not give well-defined notions of relative 
frequency 0] . This might be because such relative frequencies aren't convergent 
or converge to many different values ^Bj- Of course, if relative frequencies 
converge to many values then the most natural interpretation is to suggest that 
we are inadvertently or necessarily mixing contexts. Hence we should, as argued 
above, be very careful about the notation we use and include any contextual 
dependence in the very definition of the probabilities involved. 

Hartle, for example, has recently analysed the double slit experiment in
reference to LP history propositions and shows that if the resolution of the screen
is sufficiently high then the candidate probabilities predicted using the real
part of Eq. (69) will not be well-defined. But resolution coarser than a critical
value will give well-defined LP probabilities. How seriously should we take this? 
Normally such probabilities are interpreted in closed systems but should we not 
just check that these aren't compatible with the relative frequencies of actual 
experiments? We have argued that LP probabilities can be interpreted in a 
particularly Bayesian way; the next challenge for BH is thus to try and work 
out how such Bayesian probabilities are related (obviously such a relation might 
be non-trivial) to the relative frequencies of experiments — as this might provide 
a way to experimentally distinguish the two approaches. There are a variety of 
ways one can derive relative frequencies from Bayesian probabilities; for example 
one can invoke notions of exchangeability, independence or use maximum entropy
methods pQ. Statistical independence in history theories has been studied by 
Diosi JS] and discussed by Hartle ^H] but there may be other useful ways to 
invoke relative frequencies from Bayesian probabilities. 

To the present author, it is tempting to believe in BH simply for the cogency
of the interpretation. It uses standard notions of Bayesian probability that are
well understood and it pedagogically invokes the contextuality implicit in the
propositional nature of history propositions. Of course, further investigation
and statistical analysis of experiments are necessary to justify it above
the standard interpretation. In the standard interpretation any ordered set of
single-time measurements is realisable regardless of problems of non-additivity 
(presuming the relevant apparatus can be made). In BH, only those multi-time 
measurements that give well-defined a posteriori and a priori probabilities are 
experimentally realisable with good statistics. Thus the standard interpretation 
and BH give distinct statistical predictions when interpreted instrumentally. 
But, of course, the instrumental validity of BH bears little relation to whether
BH should be invoked when discussing closed quantum systems. In closed sys- 
tems probability is implicitly used as a form of inference rather than as relative 
frequencies of experiments so one should naturally use Bayesian probability. 
We have shown that all LP probabilities are consistent with Bayesian reason- 
ing, whereas not all probabilities of the form Q are (when using the natural 
space of history propositions). 

Entropy 

By invoking contexts in which history propositions are exclusive and exhaus- 
tive we now have the opportunity to use standard Shannon entropy to quantify 
information. For example, if a set {a 1 : i = 1, 2, N a } is exclusive and exhaus- 
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tive on a priori context C then we can define the the set's Shannon entropy 
in a simple way. Here we denote probabilities with a small p and probability 
distributions with a large P. The Shannon entropy is then simply given by: 

$$H[P(\alpha^i|C)] := -K_H \sum_{i=1}^{N_\alpha} p(\alpha^i|C) \ln p(\alpha^i|C). \qquad (79)$$

where $K_H$ is a constant. We cannot define such an entropy for sets $\{\alpha^i\}$ that 
aren't exhaustive and exclusive on C. But we can define an entropy for them if 
we take an a posteriori context I in which $\{\alpha^i\}$ are exclusive and exhaustive: 

$$H[P(\alpha^i|I)] = -K_H \sum_{i=1}^{N_\alpha} p(\alpha^i|I) \ln p(\alpha^i|I). \qquad (80)$$
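
A minimal Python sketch of definitions (79) and (80) follows. The function name, the default value of the constant and the tolerance are assumptions made for illustration; the only point carried over from the text is that the entropy is refused unless the supplied probabilities are exclusive and exhaustive in the chosen context, which is modelled here as being non-negative and summing to one.

```python
import math

def shannon_entropy(probs, k_h=1.0, tol=1e-9):
    """Return -K_H * sum_i p_i ln p_i for an exclusive and exhaustive set.

    `probs` lists p(alpha^i | context) for every member of the set.
    Terms with p_i = 0 contribute nothing, using the limit p ln p -> 0.
    """
    if any(p < -tol for p in probs):
        raise ValueError("probabilities must be non-negative")
    if abs(sum(probs) - 1.0) > tol:
        raise ValueError("set is not exhaustive in this context")
    return -k_h * sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical two-member set with equal weights in some context I.
h_uniform = shannon_entropy([0.5, 0.5])
```

With `k_h = 1` the entropy is measured in nats; a call such as `shannon_entropy([0.5, 0.25])` raises `ValueError`, mirroring the statement that no entropy is defined for a set that is not exhaustive on the given context.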

In [2], Mana has cogently argued against improper use of such entropy concepts 
in SQT. Since we have used standard Bayesian probability and kept contextuality 
in check we can use Mana's pedagogical results in the histories domain 
as well. As such, if we have two sets of propositions $\{\alpha^i\}$ and $\{\beta^j\}$ that are 
exclusive and exhaustive in the same context I (they are both 'sets of alternatives' 
in I) then we can define the conditional entropy as follows: 

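$$\begin{aligned}
H[P(\beta^j|\alpha^i I)] &:= \sum_{i=1}^{N_\alpha} p(\alpha^i|I)\, H[P(\beta^j|\alpha^i I)] && (81) \\
&= -K_H \sum_{i=1}^{N_\alpha} p(\alpha^i|I) \sum_{j=1}^{N_\beta} p(\beta^j|\alpha^i I) \ln p(\beta^j|\alpha^i I), && (82)
\end{aligned}$$

where the entropy inside the sum in (81) is taken over j at fixed i. 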

An analogous definition is used for $H[P(\alpha^i|\beta^j I)]$. In such a case the following 
standard formulae should apply by mathematical necessity 0: 

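$$\begin{aligned}
H[P(\alpha^i \wedge \beta^j|I)] &= H[P(\alpha^i|I)] + H[P(\beta^j|\alpha^i I)] && (83) \\
&= H[P(\beta^j|I)] + H[P(\alpha^i|\beta^j I)] && (84)
\end{aligned}$$

$$H[P(\beta^j|I)] \geq H[P(\beta^j|\alpha^i I)]. \qquad (85)$$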

These are the standard strong additivity and concavity properties of Shan- 
non entropy. We can avoid any of the confusions highlighted by Mana [5] by 
using such standard definitions of Shannon entropy. 
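
The relations (83) to (85) can be checked numerically for any joint distribution over two sets of alternatives. The sketch below does this in Python with K_H set to 1; the joint probabilities are hypothetical numbers chosen for illustration, and the helper names are not taken from the text.

```python
import math

def entropy(probs):
    """Shannon entropy in nats (K_H = 1), skipping zero-probability terms."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def conditional_entropy(joint, marginal):
    """Average over rows of the entropy of each row's conditional distribution."""
    total = 0.0
    for row, p_row in zip(joint, marginal):
        if p_row > 0:
            total += p_row * entropy([p / p_row for p in row])
    return total

# Hypothetical joint probabilities p(alpha^i AND beta^j | I); rows are i, columns j.
joint = [[0.30, 0.10],
         [0.15, 0.45]]

p_alpha = [sum(row) for row in joint]
p_beta = [sum(col) for col in zip(*joint)]
joint_transposed = [list(col) for col in zip(*joint)]

h_joint = entropy([p for row in joint for p in row])
h_beta_given_alpha = conditional_entropy(joint, p_alpha)
h_alpha_given_beta = conditional_entropy(joint_transposed, p_beta)

# Differences that should vanish by (83) and (84), and the gap in (85).
gap_83 = h_joint - (entropy(p_alpha) + h_beta_given_alpha)
gap_84 = h_joint - (entropy(p_beta) + h_alpha_given_beta)
gap_85 = entropy(p_beta) - h_beta_given_alpha
```

Here `gap_83` and `gap_84` vanish up to rounding and `gap_85` is non-negative. This relies on both sets being exclusive and exhaustive in the same context I, which is what allows the joint table to be normalised in the first place.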

If we interpret multi-time measurements as equally weighted mixed sets of 
null-counterfactuals then such entropy concepts allow us to compare d-consistent 
or LP sets entropically. This is not particularly useful if one interprets history 
propositions in the standard quantum cosmological manner but, of course, it 
may be very useful when interpreting quantum history propositions instrumen- 
tally. The reader is also referred, in earnest, to a Bayesian derivation of entropic 
concepts given recently by Caticha [TTj . 

Future Research 

Isham's seminal work on CH and topos theory [20] pre-empts the idea that 
d-inconsistent sets can be assigned a certain amount of meaning using a notion of 
d-accessibility which is related to our definition of d-realisability². The present 
work can be considered a pedagogical account of such toposophic concepts in 
the domain of instrumental quantum theory, which shows that such a generalisation 
provides different statistical predictions. Isham also argued that it 
is pedagogically useful to discuss d-consistent Boolean algebras rather than 
d-consistent sets per se because such objects are more akin to what we think of in 
classical probability theory. We agree with this sentiment (although we didn't 
submit to it here because of the useful illustrative notion of null-counterfactual) 
and the above work can easily be framed as such: one can define Boolean 
sub-algebras W that consist of history propositions, $W = \{\alpha^i : i = 1, 2, \ldots, M\}$, to 
be d-consistent if [20]: 

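$$d(\alpha^i \wedge \alpha^j, \alpha^i \wedge \alpha^j) = d(\alpha^i, \alpha^j) \quad \text{for all } \alpha^i, \alpha^j \in W. \qquad (86)$$

Furthermore, in our notation, we can ask that: 

$$p(\alpha^i \vee \alpha^j|I) = p(\alpha^i|I) + p(\alpha^j|I) - p(\alpha^i \wedge \alpha^j|I) \quad \text{for all } \alpha^i, \alpha^j \in W. \qquad (87)$$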
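
As a purely classical illustration of the additivity rule (87), the sketch below builds the Boolean algebra generated by three exclusive and exhaustive atoms and checks the rule for every pair of its elements. The atom labels, their weights and the function names are hypothetical; history propositions themselves are not modelled, only the Boolean structure that (87) refers to.

```python
from itertools import combinations

# Hypothetical weights p(atom | I) for three exclusive and exhaustive atoms.
weights = {"a1": 0.2, "a2": 0.3, "a3": 0.5}

def prob(proposition):
    """Probability of a proposition represented as a set of atoms."""
    return sum(weights[atom] for atom in proposition)

# Every element of the Boolean algebra is a subset of the atoms.
atoms = list(weights)
algebra = [frozenset(c)
           for r in range(len(atoms) + 1)
           for c in combinations(atoms, r)]

def additivity_holds(tol=1e-12):
    """Check p(x OR y) = p(x) + p(y) - p(x AND y) for all pairs in the algebra."""
    for x in algebra:
        for y in algebra:
            lhs = prob(x | y)
            rhs = prob(x) + prob(y) - prob(x & y)
            if abs(lhs - rhs) > tol:
                return False
    return True
```

In this representation disjunction and conjunction are set union and intersection, so two propositions fail to be exclusive exactly when they share an atom.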

The extended definition of entropy for not-necessarily exclusive events given 
by Cox jjjj can also be applied to such propositions in a Boolean algebra as long 
as (87) is satisfied; when such propositions are not-exclusive they would be 
not-exclusive in the same way that classical propositions can be not-exclusive. 
It might also be pedagogically useful to generalise single-time von Neumann 
null-counterfactuals to Boolean algebras proper. One can discuss more general 
contexts in which d-inconsistent Boolean algebras are consistent, in an a posteriori 
sense, with rules (18) and (19). The present author is not yet sure exactly 
how such toposophic concepts are related to BH; this is left for further research. 
Nor is it clear how such concepts pass across to LP history propositions. 

Operational notions |^ |22] such as Positive Operator Valued Measures 
(POVMs) provide a generalisation of von Neumann single-time measurements 
in the sense that each POVM defines a set of propositions that are apt in a measurement 
with certain probabilities. As such, it is easy to imagine an operational 
generalisation of the above work (see 123 El). In the POVM formalism single-time 
propositions are represented by effect operators that need not necessarily 
be orthogonal. When discussing such operational notions it is important to distinguish 
between 'orthogonal' propositions and 'exclusive' ones: POVMs can 
consist of non-orthogonal propositions but these propositions are interpreted 
to occur exclusively regardless of whether they are orthogonal or not. In SQT 
we can prepare a mixed state of non-orthogonal pure states such that each 
pure state occurs exclusively in the mixed state with a given weight; similarly 
POVMs can consist of exclusive non-orthogonal propositions. So, if we interpret 
the outcomes of POVMs to happen exclusively, a generalisation into the multi- 
time domain that is compatible with interpretation [21 might be possible. Such 
a multi-time generalisation, however, would require a logic to the set of effect 
operators akin to the quantum logic of projection operators. Time evolution in 
the POVM formalism is more general than unitary evolution which might add 
an extra complication. 

2 Note that our use of the term 'd-realisable' is not the same as its use in [20]. 
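
The distinction between 'orthogonal' and 'exclusive' propositions can be made concrete with a small single-time example. The Python sketch below uses the three-outcome 'trine' POVM on a qubit; the choice of POVM, the state and the helper names are assumptions for illustration only and do not attempt the multi-time generalisation discussed above.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def trine_effect(k):
    """Effect (2/3)|psi_k><psi_k| with psi_k = (cos(k*pi/3), sin(k*pi/3))."""
    theta = k * math.pi / 3.0
    c, s = math.cos(theta), math.sin(theta)
    w = 2.0 / 3.0
    return [[w * c * c, w * c * s], [w * c * s, w * s * s]]

effects = [trine_effect(k) for k in range(3)]

# The effects resolve the identity, so exactly one outcome occurs per run.
resolution = [[sum(e[i][j] for e in effects) for j in range(2)]
              for i in range(2)]

# Hypothetical mixed state (unit trace, positive semi-definite).
rho = [[0.75, 0.25], [0.25, 0.25]]
probs = [trace(matmul(e, rho)) for e in effects]

# A nonzero trace overlap shows that two effects are not orthogonal.
overlap = trace(matmul(effects[0], effects[1]))
```

The outcome probabilities are non-negative and sum to one, so the three outcomes can be treated as exclusive and exhaustive in the Bayesian sense, even though `overlap` is nonzero and the corresponding effects are not orthogonal.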

One is also tempted to apply such null-counterfactuals to Bell-like experiments. 
It is exactly a cogent notion of null-counterfactuals that is lacking in 
such analyses [25]. By describing such experiments in terms of realisable sets of 
history propositions one might nullify any proofs of nonlocality. We have shown 
above that there is no a priori reason why all multi-time candidate propositions 
made out of single-time propositions should be well-defined consistently. Simi- 
larly, there is no a priori reason why candidate null-counterfactual propositions 
about multiple spacelike separated spacetime regions must all be well-defined 
(as is implicitly assumed in and criticised by [23]). By using interpreta- 
tion [21 null-counterfactual statements might necessarily be statements about 
both spacelike separated regions and cannot be well-defined for individual small 
spacetime regions. This would, tentatively, be a way to argue against the EPR 
paper [SJ] in a way akin to Bohr's response [2Hj- It may also be a way to pro- 
mote Bayesian probability over relative frequencies [291 13l)j . This is presently 
left for further research; as are relativistic generalisations of BH. 

Conclusions 

We have shown that the two interpretations of multi-time measurements given 
by Anastopoulos 0] can be distinguished by how they treat non-realisable his- 
tory propositions. If we assume that multi-time measurements consist of suc- 
cessions of single-time measurements then one gets non-additive (and thus non- 
exclusive) propositions — this is the standard interpretation of multi-time mea- 
surements. Alternatively, if we assume that multi-time measurements are made 
up of sets of exclusive and exhaustive history propositions (and recover single- 
time SQT when using single-time history propositions) then one promotes a 
more standard notion of probabilistic exclusivity. The latter interpretation 
seems cogent and it might be experimentally differentiated from the former 
by a statistical analysis of non-realisable propositions in experiments. If the 
probabilities of non-realisable propositions all remain well-defined then we must 
stick to the standard interpretation, but otherwise the latter novel interpretation 
would be promoted. Since the latter interpretation provides a certain amount of 
philosophical clarity over the former, it is worthwhile trying to experimentally 
distinguish the two. We justify our novel approach, in part, by invoking Cox's 
probability axioms on the history algebra and showing that Bayes' rule should 
be obeyed by any natural probability assignments. 